
feat: add problog rule inference to build as code check #260

Draft
wants to merge 29 commits into base: staging

Conversation

@sophie-bates (Contributor) commented May 30, 2023

Priority (v1)

  • Update pyproject.toml with ProbLog dependency.
  • Refactor build as code run_check method into smaller subchecks.
  • Use ProbLog predicates so that the Prolog string can access the subcheck results.
  • Write ProbLog rules to evaluate the check's overall confidence score (a minimal sketch follows this list).
  • Update BuildAsCodeTable to receive the information.
  • Compare overall confidence score with threshold to make a decision about check passing.
  • Update check_result to store confidence score.
  • Add intermediate ProbLog queries that allow a decision to be made about which deployment method to store data from.
  • Store results from each ci_service and compare confidence_scores to produce the final result. Currently just using the first CI service.
  • Handle ProbLog errors properly.
  • Move subchecks to be external to BAC check.
  • Add additional subchecks.
  • Fetch project name from setup.py file. The timestamp check requires the project name. This is straightforward to fetch for projects where it's specified in pyproject.toml, but less so for those that use setup.py. I developed an initial implementation for setup.py but later removed it because I realised it was only applicable in very specific scenarios. The other option was to execute setup.py to fetch the name, but for obvious reasons we shouldn't do that.
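
To make the items above concrete, here is a minimal, self-contained sketch of how subcheck results could drive a ProbLog program. The predicate names, probabilities, rules, and threshold are illustrative assumptions, not the check's actual implementation:

from problog import get_evaluatable
from problog.program import PrologString

# Hypothetical subcheck evidence; in the real check these probabilities would
# come from the refactored subchecks rather than being hard-coded.
PROGRAM = PrologString(
    """
    0.9::deploy_action.
    0.7::deploy_command.
    0.8::workflow_trigger_release.

    % Illustrative rules combining subcheck evidence into an overall score.
    build_as_code_check :- deploy_action, workflow_trigger_release.
    build_as_code_check :- deploy_command, workflow_trigger_release.

    query(build_as_code_check).
    """
)

# Evaluating the program returns a mapping from each queried term to its probability.
results = get_evaluatable().create_from(PROGRAM).evaluate()
confidence_score = max(float(value) for value in results.values())

CONFIDENCE_SCORE_THRESHOLD = 0.5  # assumed threshold, for illustration only
print(f"Confidence score: {confidence_score:.2f}")
print("Check passes." if confidence_score >= CONFIDENCE_SCORE_THRESHOLD else "Check fails.")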

Subchecks

  • Timestamp of PyPI package vs. pipeline run - just need to finish the github_actions function that fetches the 5 most recent runs of a workflow file (a rough sketch follows below).
  • Workflow trigger
  • Test deployment
  • Secrets

Note: for many of the sub-checks we've had to check separately for the deploy action method vs. the deploy command, as there can be multiple methods used in a workflow. We need to figure out a more streamlined way to handle this.
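
As a rough illustration of the timestamp sub-check, the sketch below compares the latest PyPI upload time against the five most recent GitHub Actions runs of a workflow file. The package, repository, and workflow names are placeholders, and the one-hour window is an assumption:

from datetime import datetime, timedelta

import requests

# Placeholder values, for illustration only.
PACKAGE = "example-package"
OWNER_REPO = "example-org/example-repo"
WORKFLOW_FILE = "release.yaml"


def to_datetime(value: str) -> datetime:
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


# Latest upload time of the package on PyPI.
pypi_data = requests.get(f"https://pypi.org/pypi/{PACKAGE}/json", timeout=10).json()
latest_version = pypi_data["info"]["version"]
upload_time = to_datetime(pypi_data["releases"][latest_version][0]["upload_time_iso_8601"])

# The five most recent runs of the workflow file on GitHub Actions.
runs_data = requests.get(
    f"https://api.github.com/repos/{OWNER_REPO}/actions/workflows/{WORKFLOW_FILE}/runs",
    params={"per_page": 5},
    timeout=10,
).json()

# The sub-check would pass if any recent run finished close to the PyPI upload time.
for run in runs_data.get("workflow_runs", []):
    if abs(to_datetime(run["updated_at"]) - upload_time) < timedelta(hours=1):
        print(f"Workflow run {run['id']} matches the PyPI release timestamp.")
        break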

Later

  • Update expected outcomes for integration tests.
  • Address the assumption that a project must have a deployment method found - this limited the inference. However, the challenge is that most sub-tasks rely on the existence of a particular workflow or workflow step to check. Perhaps we could store information per workflow and build up evidence in this way.
  • We added tox -e release as a deploy command; however, release is just a target pointing to a section in a tox.ini file, so we may need to investigate parsing these files to verify that the target contains deploy commands (a rough sketch follows this list).
  • Most of the functionality of Poetry overlaps with pip anyway, and there's little differentiation between the two in terms of project setup - i.e. Poetry projects can still use the same external GHAs etc. to deploy. There could be potential to have a generic Python class and use the inference to decide which one is used, but not until later in the check.
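
For the tox -e release point above, one possible starting point is to parse tox.ini with the standard-library configparser and look for known deploy commands in the release target. The keyword list below is an assumption:

import configparser

# Commands that would suggest a tox target actually performs a deployment (assumed list).
DEPLOY_KEYWORDS = ("twine upload", "poetry publish", "flit publish")

# Disable interpolation so values containing % or {posargs} don't raise errors.
parser = configparser.ConfigParser(interpolation=None)
parser.read("tox.ini")

# `tox -e release` points at the [testenv:release] section of tox.ini.
SECTION = "testenv:release"
if parser.has_section(SECTION):
    commands = parser.get(SECTION, "commands", fallback="")
    if any(keyword in commands for keyword in DEPLOY_KEYWORDS):
        print("The release target contains a deploy command.")
    else:
        print("No deploy command found in the release target.")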

Notes:

  • Branch multiple-deploy-commands-collected-test stores some changes from the proof of concept for performing inference of multiple deployment commands (that was used in the Micronaut case study).

@oracle-contributor-agreement bot added the OCA Verified label (All contributors have signed the Oracle Contributor Agreement) on May 30, 2023
@sophie-bates marked this pull request as draft May 30, 2023 01:24
print(key, value)
if str(key) == "build_as_code_check":
    confidence_score = float(value)
results = vars(build_as_code_subchecks.build_as_code_subcheck_results)
Member:

It's okay to use this built-in function for now. But we should keep in mind that this function raises a TypeError when the __dict__ attribute cannot be found on the target object.
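
As a quick illustration of this point (not project code), vars() only works on objects that expose __dict__:

class WithDict:
    def __init__(self) -> None:
        self.score = 0.8


class NoDict:
    __slots__ = ("score",)  # instances of this class have no __dict__


print(vars(WithDict()))  # {'score': 0.8}
try:
    vars(NoDict())
except TypeError as error:
    print(f"vars() failed: {error}")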

Contributor Author:

Thanks for your comment, Nhan. I'm actually about to push a change that means I'm no longer using this anyway, so hopefully it's all good.

self.check_results: dict = {} # Update this with each check.
self.ci_info = ci_info
self.ci_service = ci_info["service"]
self.failed_check = 0.0
@tromai (Member) commented Jun 7, 2023:

I am a bit confused by this attribute failed_check. Could you document what this attribute is used for and what value it contains?
From a first look, I was expecting a boolean type instead of a float. Is this attribute the value that we should return if a sub-check fails?

Contributor Author:

Yes, it's the value that should be returned if the sub check fails.

check_certainty = 1.0
# If this check has already been run on this repo, return certainty.

justification: list[str | dict[str, str]] = ["The CI workflow files for this CI service are parsed."]
Member:

We could define the justification inside the if branch so that we don't need to do it when self.ci_info["bash_commands"] is not available.
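
A minimal sketch of that suggestion (the helper name and dict shape are assumptions, not project code):

def build_justification(ci_info: dict) -> list[str | dict[str, str]]:
    """Only create the justification when bash commands were actually extracted."""
    if ci_info.get("bash_commands"):
        return ["The CI workflow files for this CI service are parsed."]
    return []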

…ermine which results to store

Signed-off-by: sophie-bates <[email protected]>
# TODO: Investigate using proofs

# Check whether the confidence score is greater than the minimum threshold for this check.
if confidence_score >= self.confidence_score_threshold:
Member:

At this line, if a CI service has a confidence_score larger than the threshold, but smaller than other CI services, wouldn't the check return the result for the smaller confidence_score?

Contributor Author:

True. I can update this implementation to store all of the required information (i.e. deploy commands) for each ci_service and then populate the BuildAsCodeTable and AnalyzeContext with whichever ci_service has the highest confidence_score?

Member:

Sure. However, I don't understand why AnalyzeContext needs to be updated.

@sophie-bates (Contributor Author) commented Jun 8, 2023:

I just meant where the ci_info information is updated, i.e. here.

Member:

Oh, that's the inferred provenance representation. We don't update ci_info itself.
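
Sketching the idea discussed in this thread, one could collect results per CI service and keep whichever one has the highest confidence score. The dataclass and field names below are illustrative assumptions, not the project's actual types:

from dataclasses import dataclass, field


@dataclass
class CIServiceResult:
    """Illustrative container for per-CI-service inference results."""
    service_name: str
    confidence_score: float
    deploy_commands: list[str] = field(default_factory=list)


def select_best_result(
    results: list[CIServiceResult], threshold: float
) -> CIServiceResult | None:
    """Return the result with the highest confidence score at or above the threshold."""
    passing = [result for result in results if result.confidence_score >= threshold]
    return max(passing, key=lambda result: result.confidence_score, default=None)


# Example: the github_actions result wins even though circle_ci also passes.
best = select_best_result(
    [
        CIServiceResult("circle_ci", 0.6, ["./deploy.sh"]),
        CIServiceResult("github_actions", 0.9, ["twine upload dist/*"]),
    ],
    threshold=0.5,
)
print(best)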
